ENH: maybe_convert_objects seen NaT speed-up #27300

BeforeFlight · 2019-07-08T22:34:58Z

closes ENH: maybe_convert_objects seen NaT speed-up #27299
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jreback

can you add an asv that hits this case

BeforeFlight · 2019-07-08T23:01:37Z

Will add. Should I add 'what's new' entry as well?

Also, in which file should I add the asv benchmark: algorithms.py?

jreback · 2019-07-08T23:04:32Z

> Will add. Should I add 'what's new' entry as well?

yes that would be great; 0.25.0 performance section.

BeforeFlight · 2019-07-09T00:31:11Z

$ asv continuous master maybe_convert_objects_ENH -f 1.1 -b algorithms.MaybeConvertObjects
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building 56002cdd <maybe_convert_objects_ENH> for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.......................................................
·· Installing 56002cdd <maybe_convert_objects_ENH> into conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..
· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[  0.00%] · For pandas commit c64c9cb4 <master> (round 1/2):
[  0.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt..........................................................
[  0.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 25.00%] ··· Running (algorithms.MaybeConvertObjects.time_maybe_convert_objects--).
[ 25.00%] · For pandas commit 56002cdd <maybe_convert_objects_ENH> (round 1/2):
[ 25.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 25.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running (algorithms.MaybeConvertObjects.time_maybe_convert_objects--).
[ 50.00%] · For pandas commit 56002cdd <maybe_convert_objects_ENH> (round 2/2):
[ 50.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 75.00%] ··· algorithms.MaybeConvertObjects.time_maybe_convert_objects                                                                                                                                   17.1±0.3μs
[ 75.00%] · For pandas commit c64c9cb4 <master> (round 2/2):
[ 75.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 75.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· algorithms.MaybeConvertObjects.time_maybe_convert_objects                                                                                                                                   17.5±0.8ms
       before           after         ratio
     [c64c9cb4]       [56002cdd]
     <master>         <maybe_convert_objects_ENH>
-      17.5±0.8ms       17.1±0.3μs     0.00  algorithms.MaybeConvertObjects.time_maybe_convert_objects

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
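For reference, the `0.00` in the ratio column above is a rounding artifact, not a literal zero. A quick sketch of the arithmetic, using only the two timings reported in the table (the helper `to_seconds` is hypothetical, written just for this illustration):

```python
# Scale factors for the units that appear in asv output.
UNITS = {"s": 1.0, "ms": 1e-3, "μs": 1e-6, "ns": 1e-9}


def to_seconds(value, unit):
    """Hypothetical helper: convert an asv timing to seconds."""
    return value * UNITS[unit]


before = to_seconds(17.5, "ms")  # master
after = to_seconds(17.1, "μs")   # maybe_convert_objects_ENH

ratio = after / before    # what asv reports, rounded to two decimals
speedup = before / after  # the same number read the other way round

print(f"ratio   = {ratio:.6f}")
print(f"speedup = {speedup:.0f}x")
```

The ratio is about 0.001, which asv prints as `0.00`; read the other way, that is a speed-up of roughly three orders of magnitude on this one benchmark.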

WillAyd · 2019-07-09T01:08:17Z

doc/source/whatsnew/v0.25.0.rst

@@ -939,7 +939,7 @@ Performance improvements
 - Improved performance by removing the need for a garbage collect when checking for ``SettingWithCopyWarning`` (:issue:`27031`)
 - For :meth:`to_datetime` changed default value of cache parameter to ``True`` (:issue:`26043`)
 - Improved performance of :class:`DatetimeIndex` and :class:`PeriodIndex` slicing given non-unique, monotonic data (:issue:`27136`).
-
+- Improved performance of :meth:`pandas._libs.lib.maybe_convert_objects` for the case when input contains ``NaT``.


Do you know what hits this from a user perspective? This is a private method, which we wouldn't mention in the whatsnew.

What should I write instead?

Depends on what would touch this from user code. I see DataFrame.from_tuples and maybe GroupBy ops with datetimelike objects in the result - do you see a difference when using either of those?

The best way is to run the entire asv suite (takes an hour or so) and see what changes.

BeforeFlight · 2019-07-09T01:22:20Z

I believe imports should be in this order:

import pandas as pd
from pandas._libs import lib
from pandas.util import testing as tm

But when I run isort --recursive --check-only pandas locally, it only prints "Skipped 3 files" with no errors. Or did I check it wrong?

BeforeFlight · 2019-07-09T01:39:25Z

Also, maybe add isort --recursive --check-only pandas to the initial checklist for PRs, along with black pandas and git diff upstream/master -u -- "*.py" | flake8 --diff?

BeforeFlight · 2019-07-09T16:33:53Z

Full asv

asv continuous -f 1.1 master maybe_convert_objects_ENH

returns:

+      14.1±0.1ms       19.8±0.4ms     1.41  strings.Cat.time_cat(0, None, '-', 0.0)
+      2.92±0.1ms       4.06±0.3ms     1.39  algorithms.Quantile.time_quantile(0.5, 'midpoint', 'uint')
+        479±10μs         665±70μs     1.39  categoricals.Constructor.time_from_codes_all_int8
+     2.78±0.08ms       3.84±0.4ms     1.38  algorithms.Quantile.time_quantile(0.5, 'midpoint', 'int')
+      14.2±0.4ms       19.6±0.9ms     1.38  strings.Cat.time_cat(0, None, None, 0.0)
+      14.2±0.2ms       19.5±0.5ms     1.38  strings.Cat.time_cat(0, ',', None, 0.0)
+     4.62±0.07ms       6.30±0.4ms     1.37  algorithms.Quantile.time_quantile(0.5, 'midpoint', 'float')
+      14.3±0.3ms       19.4±0.7ms     1.35  strings.Cat.time_cat(0, ',', '-', 0.0)
+     2.47±0.08ms       3.31±0.3ms     1.34  algorithms.Quantile.time_quantile(1, 'nearest', 'float')
+      66.1±0.5ms        83.5±10ms     1.26  strings.Methods.time_rfind
+        729±60μs        913±200μs     1.25  ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f6cf3692bf8>, False, 'int')
+     2.87±0.04ms       3.55±0.2ms     1.24  algorithms.Quantile.time_quantile(0.5, 'higher', 'uint')
+        894±30μs      1.10±0.02ms     1.23  indexing.NonNumericSeriesIndexing.time_getitem_pos_slice('string', 'unique_monotonic_inc')
+     9.73±0.09ms       11.7±0.9ms     1.21  series_methods.ValueCounts.time_value_counts('object')
+      7.08±0.1ms       8.46±0.2ms     1.20  timeseries.ResampleSeries.time_resample('datetime', '5min', 'mean')
+        814±80μs        964±200μs     1.18  ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f6cf3692bf8>, True, 'int')
+      25.4±0.3ms         29.6±2ms     1.16  categoricals.Indexing.time_reindex
+      3.00±0.1ms       3.46±0.1ms     1.15  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'nearest')
+      4.01±0.2ms       4.51±0.7ms     1.12  ctors.SeriesConstructors.time_series_constructor(<function gen_of_tuples at 0x7f6cf36a9400>, False, 'int')
+         813±9μs         913±10μs     1.12  series_methods.IsInForObjects.time_isin_nans
+      2.60±0.1ms       2.92±0.2ms     1.12  ctors.SeriesConstructors.time_series_constructor(<class 'list'>, False, 'int')
+     2.66±0.07ms      2.97±0.06ms     1.12  ctors.SeriesConstructors.time_series_constructor(<class 'list'>, True, 'int')
+     4.86±0.04ms      5.42±0.09ms     1.12  timeseries.ToDatetimeISO8601.time_iso8601_format_no_sep
+         142±3μs          158±9μs     1.11  ctors.SeriesConstructors.time_series_constructor(<function no_change at 0x7f6cf3692b70>, True, 'int')
+        866±80μs         955±70μs     1.10  ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f6cf3692bf8>, True, 'float')
-         262±3μs          238±9μs     0.91  groupby.GroupByMethods.time_dtype_as_group('float', 'all', 'direct')
-        575±10μs          521±6μs     0.91  groupby.GroupByMethods.time_dtype_as_group('object', 'head', 'direct')
-        91.5±5ms         82.8±1ms     0.90  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlalchemy', 'datetime')
-        641±10μs          580±3μs     0.90  groupby.GroupByMethods.time_dtype_as_group('float', 'tail', 'transformation')
-     1.68±0.04μs      1.52±0.05μs     0.90  period.PeriodProperties.time_property('min', 'hour')
-         336±5μs          303±4μs     0.90  offset.OffsetDatetimeIndexArithmetic.time_add_offset(<YearBegin: month=1>)
-         155±6ms        140±0.3ms     0.90  io.csv.ReadCSVDInferDatetimeFormat.time_read_csv(False, 'custom')
-      50.4±0.7ms         45.3±1ms     0.90  groupby.Nth.time_frame_nth_any('object')
-        467±20μs          420±9μs     0.90  groupby.GroupByMethods.time_dtype_as_group('int', 'min', 'transformation')
-      1.65±0.1μs      1.49±0.02μs     0.90  period.PeriodProperties.time_property('min', 'minute')
-        353±30μs          317±8μs     0.90  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.UInt32Engine'>, <class 'numpy.uint32'>), 'non_monotonic')
-         216±2μs          193±6μs     0.90  groupby.GroupByMethods.time_dtype_as_group('object', 'size', 'transformation')
-     1.60±0.01ms      1.43±0.05ms     0.89  groupby.GroupByMethods.time_dtype_as_group('int', 'value_counts', 'direct')
-         491±6ns          439±5ns     0.89  timestamp.TimestampProperties.time_days_in_month(tzutc(), 'B')
-        172±20ms          153±2ms     0.89  io.json.ToJSON.time_delta_int_tstamp('split')
-        497±10ns          442±6ns     0.89  timestamp.TimestampProperties.time_days_in_month(<UTC>, 'B')
-        83.6±7μs         74.4±2μs     0.89  inference.ToNumeric.time_from_float('ignore')
-      7.96±0.1ms       7.08±0.1ms     0.89  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'int', 'kurt')
-      6.15±0.5μs      5.46±0.06μs     0.89  io.hdf.HDFStoreDataFrame.time_store_repr
-         178±7ms          158±2ms     0.89  io.sql.SQL.time_to_sql_dataframe('sqlalchemy')
-        432±10ms          384±3ms     0.89  io.json.ReadJSONLines.time_read_json_lines_concat('int')
-     1.71±0.02μs      1.52±0.01μs     0.89  period.PeriodProperties.time_property('M', 'year')
-         494±4ns          437±6ns     0.89  timestamp.TimestampProperties.time_days_in_month(None, None)
-        477±30μs          422±3μs     0.89  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int64Engine'>, <class 'numpy.int64'>), 'non_monotonic')
-      6.67±0.2ms       5.89±0.2ms     0.88  indexing.Take.time_take('int')
-         376±4ms          332±4ms     0.88  groupby.GroupByMethods.time_dtype_as_group('int', 'skew', 'transformation')
-      3.45±0.1ms      3.04±0.07ms     0.88  indexing.NumericSeriesIndexing.time_getitem_list_like(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
-      1.39±0.04s       1.23±0.01s     0.88  join_merge.MergeAsof.time_multiby('nearest')
-        821±10μs         724±20μs     0.88  groupby.GroupByMethods.time_dtype_as_group('float', 'sum', 'transformation')
-         462±4μs         407±20μs     0.88  groupby.GroupByMethods.time_dtype_as_group('float', 'last', 'transformation')
-      5.03±0.1ms      4.42±0.08ms     0.88  io.csv.ReadUint64Integers.time_read_uint64
-         280±4μs          246±5μs     0.88  groupby.GroupByMethods.time_dtype_as_group('float', 'count', 'direct')
-      3.33±0.2ms      2.92±0.08ms     0.88  io.csv.ReadCSVCachedParseDates.time_read_csv_cached(True)
-         375±9ms         328±10ms     0.88  groupby.GroupByMethods.time_dtype_as_group('int', 'skew', 'direct')
-      1.69±0.1μs      1.48±0.03μs     0.88  period.PeriodProperties.time_property('M', 'dayofweek')
-         149±7μs          130±1μs     0.87  series_methods.NanOps.time_func('skew', 1000, 'int8')
-        639±30μs         558±10μs     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'nunique', 'transformation')
-      2.22±0.2ms      1.93±0.01ms     0.87  reshape.SparseIndex.time_unstack
-      2.82±0.2ms      2.46±0.03ms     0.87  reshape.SimpleReshape.time_unstack
-        937±20μs         818±10μs     0.87  indexing.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
-         484±7μs          422±6μs     0.87  groupby.GroupByMethods.time_dtype_as_group('float', 'max', 'direct')
-         284±7μs          248±7μs     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'shift', 'transformation')
-      4.20±0.1ms      3.66±0.02ms     0.87  join_merge.Merge.time_merge_dataframe_integer_key(True)
-     2.52±0.08ms      2.19±0.04ms     0.87  io.csv.ReadCSVFloatPrecision.time_read_csv(',', '.', None)
-      1.70±0.1μs      1.47±0.02μs     0.87  period.PeriodProperties.time_property('min', 'month')
-         266±2μs          231±5μs     0.87  groupby.GroupByMethods.time_dtype_as_group('float', 'any', 'direct')
-         160±9ms          139±2ms     0.87  io.json.ToJSON.time_floats_with_dt_index_lines('split')
-        27.6±2ms       23.9±0.6ms     0.87  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlite', 'float')
-      12.8±0.9ms       11.1±0.2ms     0.87  reindex.LibFastZip.time_lib_fast_zip
-         194±2μs          168±4μs     0.87  indexing.NumericSeriesIndexing.time_ix_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'unique_monotonic_inc')
-        283±20μs          245±5μs     0.86  groupby.GroupByMethods.time_dtype_as_group('float', 'shift', 'direct')
-     1.72±0.08μs      1.49±0.01μs     0.86  period.PeriodProperties.time_property('M', 'dayofyear')
-        971±70μs         839±10μs     0.86  indexing.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
-         224±1μs          193±2μs     0.86  groupby.GroupByMethods.time_dtype_as_group('float', 'size', 'transformation')
-       654±200μs         563±20μs     0.86  groupby.GroupByMethods.time_dtype_as_group('float', 'tail', 'direct')
-        875±20μs          753±6μs     0.86  indexing.NonNumericSeriesIndexing.time_getitem_pos_slice('string', 'non_monotonic')
-     1.33±0.05ms      1.14±0.01ms     0.86  groupby.GroupByMethods.time_dtype_as_field('datetime', 'rank', 'direct')
-        47.5±6ms       40.8±0.9ms     0.86  plotting.TimeseriesPlotting.time_plot_regular_compat
-         217±5μs          186±1μs     0.86  groupby.GroupByMethods.time_dtype_as_field('datetime', 'size', 'direct')
-        32.7±1ms       28.0±0.9ms     0.86  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlite', 'float_with_nan')
-      4.00±0.3ms      3.42±0.03ms     0.86  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'int', 'mean')
-       1.33±0.1s       1.14±0.01s     0.86  join_merge.MergeAsof.time_multiby('forward')
-        654±40μs         558±20μs     0.85  groupby.GroupByMethods.time_dtype_as_group('float', 'quantile', 'direct')
-     1.47±0.09ms      1.25±0.06ms     0.85  groupby.GroupByMethods.time_dtype_as_group('object', 'value_counts', 'direct')
-     1.50±0.04ms      1.28±0.02ms     0.85  groupby.GroupByMethods.time_dtype_as_group('datetime', 'rank', 'transformation')
-     3.21±0.04ms      2.73±0.09ms     0.85  indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-        499±20μs         424±10μs     0.85  groupby.GroupByMethods.time_dtype_as_group('float', 'min', 'transformation')
-     1.32±0.07ms      1.12±0.01ms     0.85  indexing.DataFrameNumericIndexing.time_bool_indexer
-        371±10μs          314±6μs     0.85  indexing.NumericSeriesIndexing.time_ix_scalar(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
-        20.5±5μs       17.4±0.2μs     0.85  index_object.Indexing.time_slice_step('Float')
-        70.2±7μs         59.4±2μs     0.85  series_methods.NanOps.time_func('argmax', 1000, 'int32')
-        503±50μs          425±6μs     0.84  stat_ops.SeriesOps.time_op('mean', 'float', True)
-        413±20ms         348±10ms     0.84  io.stata.StataMissing.time_write_stata('tq')
-        205±10ms          173±3ms     0.84  io.json.ToJSON.time_float_int_str_lines('split')
-        27.8±1ms       23.4±0.2ms     0.84  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlite', 'string')
-        138±10ms          116±1ms     0.84  reshape.Cut.time_qcut_int(1000)
-        70.7±7μs         59.5±3μs     0.84  series_methods.NanOps.time_func('argmax', 1000, 'int8')
-        801±30μs          674±8μs     0.84  io.parsers.ConcatDateCols.time_check_concat('AAAA', 1)
-      2.37±0.2ms      2.00±0.03ms     0.84  groupby.GroupByMethods.time_dtype_as_group('int', 'pct_change', 'direct')
-        633±40μs          530±3μs     0.84  groupby.GroupByMethods.time_dtype_as_group('datetime', 'nunique', 'direct')
-     1.85±0.06μs      1.55±0.02μs     0.84  period.PeriodProperties.time_property('M', 'qyear')
-        599±10μs         501±20μs     0.84  groupby.GroupByMethods.time_dtype_as_group('object', 'tail', 'direct')
-      1.76±0.2μs      1.47±0.01μs     0.84  period.PeriodProperties.time_property('min', 'dayofyear')
-        42.1±5ms       35.1±0.9ms     0.83  io.sql.SQL.time_read_sql_query('sqlalchemy')
-      3.55±0.3ms      2.95±0.08ms     0.83  io.parsers.ConcatDateCols.time_check_concat(1234567890, 2)
-      9.72±0.9ms       8.06±0.5ms     0.83  series_methods.NanOps.time_func('std', 1000000, 'int32')
-        60.8±5ms       50.2±0.8ms     0.82  io.sql.ReadSQLTable.time_read_sql_table_all
-     1.56±0.06ms      1.28±0.02ms     0.82  groupby.GroupByMethods.time_dtype_as_group('float', 'rank', 'direct')
-        601±10μs         493±10μs     0.82  indexing.NumericSeriesIndexing.time_ix_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
-     5.00±0.03ms      4.09±0.05ms     0.82  indexing.NonNumericSeriesIndexing.time_getitem_list_like('string', 'non_monotonic')
-        83.2±5ms         68.1±2ms     0.82  io.hdf.HDFStoreDataFrame.time_read_store_table_mixed
-        63.9±6ms         52.3±1ms     0.82  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlalchemy', 'string')
-         123±6ms          101±2ms     0.82  multiindex_object.Duplicated.time_duplicated
-      1.22±0.06s         992±10ms     0.81  join_merge.I8Merge.time_i8merge('left')
-        231±20μs          188±3μs     0.81  groupby.GroupByMethods.time_dtype_as_group('int', 'size', 'transformation')
-        134±10μs          109±3μs     0.81  series_methods.NanOps.time_func('argmax', 1000, 'float64')
-      8.49±0.1ms       6.88±0.2ms     0.81  io.sas.SAS.time_read_msgpack('xport')
-        289±10μs         233±10μs     0.81  groupby.GroupByMethods.time_dtype_as_group('float', 'shift', 'transformation')
-        225±20μs          181±4μs     0.81  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'unique_monotonic_inc')
-     1.10±0.06ms         890±10μs     0.81  indexing.NumericSeriesIndexing.time_ix_slice(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
-        351±20μs          281±6μs     0.80  inference.NumericInferOps.time_add(<class 'numpy.uint8'>)
-        37.2±5ms       29.8±0.5ms     0.80  io.sql.SQL.time_read_sql_query('sqlite')
-        23.4±1ms       18.7±0.8ms     0.80  multiindex_object.Integer.time_get_indexer
-        156±10ms          124±2ms     0.79  io.sas.SAS.time_read_msgpack('sas7bdat')
-      3.63±0.1ms       2.87±0.1ms     0.79  indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
-         250±7ms          198±2ms     0.79  inference.ToNumericDowncast.time_downcast('string-int', None)
-     1.19±0.03ms         944±20μs     0.79  indexing.NumericSeriesIndexing.time_ix_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
-       92.7±20μs       72.9±0.7μs     0.79  inference.ToNumeric.time_from_float('coerce')
-        24.3±3ms       19.1±0.4ms     0.79  groupby.Nth.time_frame_nth('float32')
-      11.7±0.7ms       9.07±0.3ms     0.78  reindex.DropDuplicates.time_frame_drop_dups_int(True)
-        74.9±6ms         58.3±5ms     0.78  io.parsers.DoesStringLookLikeDatetime.time_check_datetimes('0.0')
-        354±30μs          273±4μs     0.77  join_merge.Concat.time_concat_empty_left(0)
-      2.01±0.2ms      1.54±0.03ms     0.77  reindex.LevelAlign.time_align_level
-         108±4ms         80.6±8ms     0.75  io.parsers.DoesStringLookLikeDatetime.time_check_datetimes('10000')
-        83.2±2ms         58.8±1ms     0.71  io.sql.WriteSQLDtypes.time_to_sql_dataframe_column('sqlalchemy', 'bool')
-      23.0±0.4ms       16.2±0.5ms     0.70  indexing.NumericSeriesIndexing.time_ix_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-        48.7±1ms         33.6±2ms     0.69  io.msgpack.MSGPack.time_write_msgpack
-        442±30μs          297±6μs     0.67  indexing.NumericSeriesIndexing.time_ix_slice(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'unique_monotonic_inc')
-      1.98±0.3ms      1.32±0.02ms     0.67  io.parsers.ConcatDateCols.time_check_concat(1234567890, 1)
-     1.97±0.09ms      1.31±0.04ms     0.66  groupby.GroupByMethods.time_dtype_as_group('float', 'sem', 'direct')
-      1.11±0.3ms         703±10μs     0.63  groupby.GroupByMethods.time_dtype_as_group('int', 'prod', 'transformation')
-        16.2±1ms       10.2±0.2ms     0.63  reindex.DropDuplicates.time_frame_drop_dups_na(True)
-        36.1±3ms       22.5±0.6ms     0.62  io.msgpack.MSGPack.time_read_msgpack
-        3.25±2μs      1.62±0.01μs     0.50  period.PeriodProperties.time_property('min', 'week')
-        3.02±1μs      1.50±0.05μs     0.50  period.PeriodProperties.time_property('min', 'qyear')
-      16.2±0.3ms         19.1±1μs     0.00  algorithms.MaybeConvertObjects.time_maybe_convert_objects

So there are (somehow) regressions as well (or I am using asv wrong, or interpreting its results wrong); I need to recheck this.

Btw, a full asv run takes 3-4 hours on my laptop, not counting the build, so it is not fast here.

WillAyd · 2019-07-09T16:38:28Z

Yea I think there is some noise in there - do some of the regressions even hit this code?
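One rough way to triage a list like this is to flag a change only when the before and after intervals do not overlap. This is a sketch under the assumption that the `±` figures can be treated as simple error bars; asv's own significance check is more involved, and the function name and the selection of rows are illustrative. The numbers are copied from the table above.

```python
def intervals_overlap(before, before_err, after, after_err):
    """Return True when [before ± err] and [after ± err] overlap."""
    lo_b, hi_b = before - before_err, before + before_err
    lo_a, hi_a = after - after_err, after + after_err
    return lo_b <= hi_a and lo_a <= hi_b


# (benchmark, before, before_err, after, after_err), all in microseconds
rows = [
    ("ctors.SeriesConstructors.time_series_constructor(list_of_str, False, 'int')",
     729, 60, 913, 200),
    ("strings.Cat.time_cat(0, None, '-', 0.0)",
     14100, 100, 19800, 400),
    ("algorithms.MaybeConvertObjects.time_maybe_convert_objects",
     16200, 300, 19.1, 1),
]

for name, b, b_err, a, a_err in rows:
    if intervals_overlap(b, b_err, a, a_err):
        verdict = "within error bars"
    else:
        verdict = "outside error bars"
    print(f"{verdict:>18}  {name}")
```

Non-overlapping intervals only say that two runs differed, not why. A benchmark that never calls the changed code can still drift between runs, so this check narrows the list without replacing the question of whether the regression hits this code at all.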

jreback · 2019-07-09T16:41:40Z

ok so likely this path is not hit in our asvs

so remove the whatsnew note and looks good

WillAyd · 2019-07-09T18:01:08Z

asv_bench/benchmarks/algorithms.py

@@ -13,6 +15,19 @@
        pass


+class MaybeConvertObjects:


Not sure we even need this benchmark since it doesn't indicate anything from end user experience but up to @jreback

Break alone in PR lgtm

I want to add more tests here later as well - for more generalized cases.
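For readers without the diff at hand, the general idea behind an early `break` in a scanning loop can be sketched in plain Python. This is not the pandas Cython code: the sentinel, the `scan` function, its flags, and the input are all assumptions made for illustration. The point is only that once a value settles the outcome, inspecting the remaining items is wasted work.

```python
class _MissingDatetime:
    """Stand-in sentinel for this sketch; it is not pandas.NaT."""


NAT = _MissingDatetime()


def scan(values, convert_datetime=False, early_exit=True):
    """Return (must_stay_object, items_inspected) for a list of values."""
    must_stay_object = False
    inspected = 0
    for val in values:
        inspected += 1
        if val is NAT and not convert_datetime:
            # In this sketch nothing later in the input can change the
            # outcome, so the rest of the loop does no useful work.
            must_stay_object = True
            if early_exit:
                break
    return must_stay_object, inspected


# Assumed input: one sentinel followed by many plain integers.
data = [NAT] + list(range(100_000))

print(scan(data, early_exit=False))  # (True, 100001)
print(scan(data, early_exit=True))   # (True, 1)
```

Both calls reach the same answer; the second simply stops as soon as the answer is known, which is why the gain depends heavily on where the sentinel sits in the input.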

jreback · 2019-07-09T20:43:12Z

thanks @BeforeFlight followups welcome.

Break added

0f30fc5

jreback requested changes Jul 8, 2019

jreback added the Performance (memory or execution speed performance) and Timedelta (Timedelta data type) labels Jul 8, 2019

Asv bench and what's new entry added.

56002cd

WillAyd reviewed Jul 9, 2019

Imports order changed. What's new entry changed.

790bf8c

Fix imports order, remove what's new entry.

e8173c6

WillAyd reviewed Jul 9, 2019

jreback added this to the 0.25.0 milestone Jul 9, 2019

jreback approved these changes Jul 9, 2019

jreback merged commit 9240439 into pandas-dev:master Jul 9, 2019
